On the origin of power law tails in price fluctuations 



J. Doyne Farmer^1 and Fabrizio Lillo^1,2

^1 Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501
^2 Istituto Nazionale per la Fisica della Materia, Unità di Palermo, Italy
(Dated: February 2, 2008) 

In a recent Nature paper, Gabaix et al. presented a testable theory to explain the power law 
tail of price fluctuations. The main points of their theory are that volume fluctuations, which have a 
power law tail with exponent roughly -1.5, are modulated by the average market impact function,
which describes the response of prices to transactions. They argue that the average market impact
function follows a square root law, which gives power law tails for prices with exponent roughly
-3. We demonstrate that the long-memory nature of order flow invalidates their statistical analysis
of market impact, and present a more careful analysis that properly takes this into account. This 
makes it clear that the functional form of the average market impact function varies from market 
to market, and in some cases from stock to stock. In fact, for both the London Stock Exchange 
and the New York Stock Exchange the average market impact function grows much slower than a 
square root law; this implies that the exponent for price fluctuations predicted by modulations of 
volume fluctuations is much too big. We find that for LSE stocks traded in the electronic market 
the distribution of transaction volumes does not even have a power law tail. This makes it clear 
that volume fluctuations do not determine the power law tail of price returns. 



Gabaix et al. have recently proposed a testable theory for the origin of
power law tails in price fluctuations. In essence, their proposal is that
they are driven by fluctuations in the volume of transactions, modulated
by a deterministic market impact function. More specifically, they argue
that the distribution of large trade sizes scales as
$P(V > x) \sim x^{-\gamma}$, where $V$ is the volume of the trade and
$\gamma \approx 3/2$. Based on the assumption that agents are profit
optimizers, they argue that the average market impact function^1 is a
deterministic function of the form $r = kV^{\beta}$, where $r$ is the
change in the logarithm of price resulting from a transaction of volume
$V$, $k$ is a constant, and $\beta = 1/2$. This implies that large price
returns $r$ have a power law distribution with exponent
$\alpha = \gamma/\beta \approx 3$. They argue that their theory is
consistent with the data, even though their hypothesis about market impact
appears to contradict several other previous studies [2, 3, 4] in the same
markets they study (the New York and Paris Stock Exchanges).
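The step from the two power laws to the exponent for returns is a change of
variables. A minimal sketch, assuming that the relation $r = kV^{\beta}$
holds deterministically for large trades and considering positive returns
only (the threshold $x$ is notation introduced here for illustration):

```latex
\begin{align}
P(r > x) &= P\big(kV^{\beta} > x\big)
          = P\big(V > (x/k)^{1/\beta}\big) \\
         &\sim \big((x/k)^{1/\beta}\big)^{-\gamma}
          \propto x^{-\gamma/\beta},
\end{align}
```

so that $\alpha = \gamma/\beta$, and $\gamma \approx 3/2$ with
$\beta = 1/2$ gives $\alpha \approx 3$.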



I. PROBLEMS WITH THE TEST OF GABAIX ET AL.

Gabaix et al. [1] present statistical evidence that appears to show that
the NYSE and Paris data are consistent with the hypothesis that the average
market impact follows a square root law. In this section we show that their
test may have problems in circumstances (such as those of the real data) in
which orders have long-memory properties. This weakens their test, so that
it lacks the power to reject reasonable alternative hypotheses and may give
misleading results.

^1 One should more properly think of the market impact as a response to the
order initiating the trade. That is, in every transaction there is a
just-arrived order that causes the trade to happen, and this order tends to
alter the best quoted price in the direction of the trade, e.g. a buy order
tends to drive the price up, and a sell order tends to drive it down.

Their method to test the hypothesis of square root price impact is to
investigate $E[r^2|V]$ over a given time interval, e.g. 15 minutes, where
$r$ is the price shift and $V = \sum_{i=1}^{M} V_i$ is the sum of the
volumes of the $M$ transactions occurring in that time interval. They have
chosen to analyze $r^2$ rather than $r$ because of its properties under
time aggregation. To see why this might be useful, assume the return due to
each transaction $i$ is of the form
$r_i = k \epsilon_i V_i^{\beta} + u_i$, where $u_i$ is an IID noise process
that is uncorrelated with $V_i$, and $\epsilon_i$ is the sign of the
transaction. The squared return for the interval is then of the form

$$ r^2 = \Big( \sum_{i=1}^{M} r_i \Big)^2
       = \sum_{i=1}^{M} \big( k \epsilon_i V_i^{\beta} + u_i \big)^2
       + \sum_{i \neq j} \big( k \epsilon_i V_i^{\beta} + u_i \big)
                         \big( k \epsilon_j V_j^{\beta} + u_j \big). $$

Under the assumption that $V_i$, $V_j$, $\epsilon_i$, and $\epsilon_j$ are
all uncorrelated, when $\beta = 1/2$ it is easy to show that
$E[r^2|V] = a + bV$, where $a$ and $b$ are constants.
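The linear form can be made explicit with a short calculation. This is a
sketch under additional assumptions that the text does not state: the signs
$\epsilon_i$ and the noise $u_i$ have zero mean, $u_i$ has variance
$\sigma_u^2$ (a symbol introduced here), and the number of transactions $M$
in the interval is fixed. Using $\epsilon_i^2 = 1$, the cross terms then
vanish in expectation and, for $\beta = 1/2$,

```latex
\begin{align}
E[r^2 \mid V]
  &= \sum_{i=1}^{M} E\big[\, k^2 V_i^{2\beta}
       + 2k\,\epsilon_i V_i^{\beta} u_i + u_i^2 \,\big|\, V \big]
     + \sum_{i \neq j} E[\, r_i r_j \mid V \,] \\
  &= k^2 \sum_{i=1}^{M} V_i + M \sigma_u^2
   = M \sigma_u^2 + k^2 V ,
\end{align}
```

which identifies $a = M\sigma_u^2$ and $b = k^2$ under these assumptions.
For $\beta \neq 1/2$ the sum $\sum_i V_i^{2\beta}$ is no longer a function
of $V$ alone, which is why linearity in $V$ is taken as the signature of the
square root law.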



The problem is that for the real data $V_i$, $V_j$, $\epsilon_i$, and
$\epsilon_j$ are strongly correlated, and indeed, the sequence of signs
$\epsilon_i$ is a long-memory process. To demonstrate the gravity of this
problem, we use real transactions $V_i$, but introduce an artificial and
deterministic market impact function of the form $r_i = k V_i^{\beta}$ with
$\beta \neq 0.5$. We first fix the number of transactions, and then repeat
the same procedure using a fixed time period. We examine blocks of trades
with $M$ transactions, $\{\epsilon_i, V_i\}$, $i = 1, \ldots, M$, where
$\epsilon_i = +1$ ($-1$) for buyer (seller) initiated trades and $V_i$ is
the volume of the trade in number of shares. For each trade we create an
artificial price return $r_i = k \epsilon_i V_i^{\beta}$, where $k$ is a
constant. Then for each block of $M$ trades we compute
$r = \sum_{i=1}^{M} r_i = k \sum_{i=1}^{M} \epsilon_i V_i^{\beta}$ and
$V = \sum_{i=1}^{M} V_i$. Since we are using the real order flow we are
incorporating the correct autocorrelation of the signs $\epsilon_i$ and
transaction sizes $V_i$. Figure 1(a) shows $E[r^2|V]$ for different values
of $M$ and $\beta = 0.3$ for the British stock Vodafone in the period from
May 2000 to December 2002, a series which contains approximately $10^6$
trades. We see that for small values of $M$ the quantity $E[r^2|V]$ follows
the artificial market impact functional form
$E[r^2|V] \sim V^{2\beta} = V^{0.6}$, but when $M$ is large the relation
between $E[r^2|V]$ and $V$ becomes linear. The value $M = 40$ is roughly
the average number of trades in a 15 minute interval. We also show error
bars computed as specified by Gabaix et al. We cannot reject the null
hypothesis of a linear relation between $E[r^2|V]$ and $V$ with 95%
confidence, even though we have a large amount of data, and we know by
construction that $\beta$ is quite different from 1/2. We have also
performed tests on other stocks, which give similar results.
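The block construction just described can be sketched in a few lines. This
is only an illustration: the function names are hypothetical, the
equal-count binning of $V$ is a choice made here rather than the procedure
of Gabaix et al., and the order flow below is synthetic and uncorrelated,
whereas the result in the text depends on using the real, autocorrelated
signs and volumes.

```python
import numpy as np

def block_returns(eps, vol, M, k=1.0, beta=0.3):
    """Aggregate artificial returns r_i = k*eps_i*vol_i**beta over blocks of M trades."""
    eps = np.asarray(eps, dtype=float)
    vol = np.asarray(vol, dtype=float)
    n_blocks = len(vol) // M                  # drop the incomplete final block
    eps = eps[: n_blocks * M].reshape(n_blocks, M)
    vol = vol[: n_blocks * M].reshape(n_blocks, M)
    r = k * np.sum(eps * vol**beta, axis=1)   # total return per block
    V = np.sum(vol, axis=1)                   # total volume per block
    return r, V

def conditional_mean_r2(r, V, n_bins=10):
    """Estimate E[r^2 | V] by sorting blocks on V into equal-count bins."""
    order = np.argsort(V)
    V_bins = np.array_split(V[order], n_bins)
    r2_bins = np.array_split(r[order] ** 2, n_bins)
    V_mean = np.array([b.mean() for b in V_bins])
    r2_mean = np.array([b.mean() for b in r2_bins])
    return V_mean, r2_mean

# Synthetic stand-in for the order flow (the paper uses real LSE data).
rng = np.random.default_rng(0)
eps = rng.choice([-1.0, 1.0], size=40_000)
vol = rng.lognormal(mean=0.0, sigma=1.0, size=40_000)
r, V = block_returns(eps, vol, M=40)
V_mean, r2_mean = conditional_mean_r2(r, V)
```

Plotting `r2_mean` against `V_mean` for several values of `M` gives the kind
of curve shown in Figure 1.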

One can ask whether it makes a difference that we used a fixed number of
transactions rather than a fixed time interval. To test this we repeat the
procedure using a fixed time interval of 15 minutes. Figure 1(b) shows the
result. We see an even clearer linear relation between $E[r^2|V]$ and $V$
than before, so that the test once again fails.

Why doesn't this test work? To gain some understanding of this, we repeat
the same test but shuffle the order of the data, which breaks the
correlation structure. As shown in Figure 1(c), the result in this case is
far from linear even when $M = 40$, and the test easily shows that the
market impact does not follow a square root law. Thus, we see that the
problem lies in the autocorrelation structure of the real data.
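The shuffling step can be illustrated as follows. This is one possible
reading of the procedure: it assumes that each sign stays attached to its
volume and only the time ordering of the pairs is randomized. The helper
names and the toy series are hypothetical.

```python
import numpy as np

def shuffle_trades(eps, vol, seed=0):
    """Apply one random permutation to signs and volumes jointly."""
    eps = np.asarray(eps)
    vol = np.asarray(vol)
    perm = np.random.default_rng(seed).permutation(len(eps))
    return eps[perm], vol[perm]

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a series."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

# Toy persistent sign series: runs of 20 buys followed by runs of 20 sells.
eps = np.repeat([1, -1] * 50, 20)       # 2000 signs
vol = np.arange(1, len(eps) + 1)        # hypothetical volumes, used to track pairing
eps_s, vol_s = shuffle_trades(eps, vol)
```

The marginal distributions of signs and volumes are unchanged by the
permutation; only the serial dependence is destroyed, which is the property
the comparison between the shuffled and unshuffled tests relies on.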

In conclusion, our numerical simulations show that the linearity test of
$E[r^2|V]$ lacks power to test for a square root market impact with data
containing the correlation structure of real data. In fact, even a
deterministic market impact like $r \sim V^{0.3}$ is consistent with the
relation $E[r^2|V] = a + bV$ for a sufficiently large number of trades.
Doing this for a fixed time interval rather than a fixed number of trades
makes this even more evident. Thus the test of Gabaix et al. provides no
evidence that the average market impact follows a square root law.



II. PLACING ERROR BARS ON THE AVERAGE MARKET IMPACT

While there have been many previous studies of aver- 
age market impact, they have not included the statistical 
analysis needed to assign good error bars. In this section 
we present results about average market impact at the 
level of individual ticks. We show that it does not gen- 
erally follow a square root law, and that it varies from 
market to market and in some cases from stock to stock 
in a substantial and statistically significant way. 

FIG. 1: A demonstration that the statistical test of Gabaix et al. [1]
fails due to the strong autocorrelations in real data. The expected value
of the squared price return, $E[r^2|V]$, is plotted as a function of total
transaction size $V = \sum_{i=1}^{M} V_i$, where $V_i$ is the size of
transaction $i$. Each transaction causes a simulated market impact of the
form $r_i = k \epsilon_i V_i^{\beta}$, to generate total return
$r = \sum_{i=1}^{M} r_i$. The transaction series $V_i$ and $\epsilon_i$ are
from the real data from the electronic market for the British stock
Vodafone, and contain roughly $10^6$ events. The error bars are the 95%
confidence intervals computed following the procedure specified by Gabaix
et al. (a) shows the results for a fixed number of transactions, with $M$
varying from 2 to 40; the curves are in ascending order of $M$; (b) is the
same using a fixed time interval of 15 minutes, with variable $M$; and (c)
is the same as (a) with the order of the transactions randomly shuffled.
For (a) and (b) we see straight lines for large $M$, indicating that the
test is passed, even though by construction the market impact does not
follow the $r \sim V^{1/2}$ hypothesis, whereas for the shuffled data the
test quite clearly shows us that the hypothesis is false.

Realistic error bars for the average market impact are difficult to assess
due to the fact that volatility is a long-memory process. That is, its time
series has a slowly decaying power law autocorrelation function that is
asymptotically of the form $\tau^{-\kappa}$, with $\kappa < 1$ so that the
integral is unbounded. This makes error analysis complicated, since data
from the distant past have a strong effect on data in the present. Because
volatility is long-memory, the price returns that fall in a given volume
bin $V_a$, which are by definition all of the same sign, are also
long-memory. This means that the errors in measuring market impact are much
larger than one would expect from intuition based on an IID hypothesis.
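The size of the effect can be seen from the standard expression for the
variance of a sample mean. As a sketch, assume a stationary series $x_t$
with variance $\sigma^2$ and autocorrelation function $\rho(\tau)$ that
decays asymptotically as $\tau^{-\kappa}$ with $0 < \kappa < 1$ (the
symbols $\bar{x}_n$, $\sigma^2$ and $\rho$ are notation introduced here):

```latex
\begin{align}
\mathrm{Var}(\bar{x}_n)
  &= \frac{\sigma^2}{n}
     \Big[ 1 + 2 \sum_{\tau=1}^{n-1} \Big( 1 - \frac{\tau}{n} \Big)
           \rho(\tau) \Big]
   \;\propto\; n^{-\kappa}
   \quad \text{for large } n,
\end{align}
```

compared with $\mathrm{Var}(\bar{x}_n) = \sigma^2/n$ for IID data. The
standard deviation of the mean therefore decays as $n^{H-1}$ with
$H = 1 - \kappa/2 > 1/2$, more slowly than the IID rate $n^{-1/2}$.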

We analyze the market impact only for orders (or portions of orders) that
result in immediate transactions. We call the portion of an order that
results in an immediate transaction an effective market order, and for the
remainder of the paper $V_i$ represents effective market order size rather
than transaction size. Each order of size $V_i$ generates a price return
$r_i = \log p_a - \log p_b$, where $p_b$ is the midpoint price quote just
before the order is placed and $p_a$ is the midpoint price quote just
after. We analyze buy and sell orders separately. The electronic (SETS)
data for the LSE has the advantage that the data set contains a record of
orders, and so we can distinguish buy and sell orders unambiguously, but
has the disadvantage that it omits trades made in the upstairs market^2.
For the NYSE data we use the trades and quotes (TAQ) data to infer orders
and their signs using the Lee and Ready algorithm; to identify orders we
lump together all trades with the same timestamp and order code. To
estimate the average market impact we sort the events $(V_i, r_i)$ with the
same sign into bins based on $V_i$ and plot the average value of $V_i$ for
each bin against the average value of $r_i$, as shown in Figure 2. We
choose the bins so that each bin has roughly the same number of points in
it.
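The binning step can be sketched as follows, assuming the order sizes and
returns for one sign are already available as arrays. The function name is
hypothetical and the data are constructed so that the answer is known in
advance; they are not measurements.

```python
import numpy as np

def binned_impact(vol, ret, n_bins=20):
    """Average volume and average return in equal-count bins sorted by volume."""
    vol = np.asarray(vol, dtype=float)
    ret = np.asarray(ret, dtype=float)
    order = np.argsort(vol, kind="stable")
    v_mean = np.array([b.mean() for b in np.array_split(vol[order], n_bins)])
    r_mean = np.array([b.mean() for b in np.array_split(ret[order], n_bins)])
    return v_mean, r_mean

# Hypothetical buy orders with an exact impact r = V**0.25, to check the binning.
vol = np.array([1.0, 16.0, 81.0, 256.0, 1.0, 16.0, 81.0, 256.0])
ret = vol ** 0.25
v_mean, r_mean = binned_impact(vol, ret, n_bins=4)

# Slope of the binned curve on log-log axes, i.e. an estimate of beta.
beta_hat = np.polyfit(np.log(v_mean), np.log(r_mean), 1)[0]
```

Using equal-count rather than equal-width bins keeps the number of points
per bin roughly constant, so the statistical weight of each bin is
comparable across the volume range.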

FIG. 2: Market impact function for buy orders of three stocks traded in the
New York Stock Exchange (blue, dashed) and three stocks traded in the
London Stock Exchange (red, solid). The horizontal axis is volume. Orders
of similar size $V_i$ are binned together; on the horizontal axis we show
the average volume of the orders in each bin, and on the vertical axis the
average size of the logarithmic price change for the orders in that bin. In
both cases comparison to the dashed black line in the corner, which has
slope 1/2, makes it clear that the behavior for large volume does not
follow a law of the form $r_i \sim V_i^{1/2}$. Error bars are computed
using the variance plot method as described in the text.

To assign error bars for each bin we use the variance plot method. For each
bin we split the events into $m$ subsamples with $n = K/m$ points, where
$K$ is the number of records in the bin. The subsamples are chosen to be
blocks of values adjacent in time. For each subsample $i$,
$i = 1, \ldots, m$, we compute the mean. Then we compute the standard
deviation of the $m$ subsample means, which we indicate as $\sigma^{(n)}$.
By plotting $\sigma^{(n)}$ versus $n$ in a log-log plot we compute the
Hurst exponent $H$ by fitting the data with a power-law function
$\sigma^{(n)} = A n^{H-1}$. We compute the error in the mean of the entire
sample of $K$ points by extrapolating the fitted function to the value
$n = K$, i.e. $\sigma = \hat{A} K^{\hat{H}-1}$, where $\hat{A}$ and
$\hat{H}$ are the ordinary least squares estimates of the parameters $A$
and $H$. Interestingly, for smaller values of $V_i$ we find Hurst exponents
substantially larger than 1/2, whereas for large values of $V_i$ the Hurst
exponents are much closer to 1/2. When $H > 1/2$ the error bars are
typically much larger than standard errors^3.
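A minimal sketch of the variance plot procedure for a single bin is given
below. The function name and the particular block sizes are choices made
here for illustration, and the input is IID noise rather than market data,
so the fitted Hurst exponent should come out close to 1/2.

```python
import numpy as np

def variance_plot_error(x, block_sizes):
    """Estimate H and the error of the mean from block-mean standard deviations."""
    x = np.asarray(x, dtype=float)
    K = len(x)
    sigmas = []
    for n in block_sizes:
        m = K // n                                     # number of contiguous blocks
        means = x[: m * n].reshape(m, n).mean(axis=1)  # one mean per block
        sigmas.append(means.std(ddof=1))
    # Fit log sigma = log A + (H - 1) log n by ordinary least squares.
    slope, intercept = np.polyfit(np.log(block_sizes), np.log(sigmas), 1)
    H = slope + 1.0
    A = np.exp(intercept)
    sigma_mean = A * K ** (H - 1.0)                    # extrapolate to n = K
    return H, sigma_mean

# IID test data (an assumption for illustration only).
rng = np.random.default_rng(1)
x = rng.normal(size=2**14)
H, err = variance_plot_error(x, block_sizes=[4, 8, 16, 32, 64, 128, 256])
iid_err = x.std(ddof=1) / np.sqrt(len(x))              # ordinary standard error
```

For IID input `err` and `iid_err` should be of comparable size; for a
long-memory series the fitted `H` exceeds 1/2 and `err` becomes larger than
the ordinary standard error, which is the situation described in the text.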

In Figure 2 we show empirical measurements of the average market impact for
the New York Stock Exchange and for the London Stock Exchange. We consider
three highly capitalized stocks for each exchange, Lloyds (LLOY), Shell
(SHEL) and Vodafone (VOD) for the LSE, and General Electric (GE), Procter &
Gamble (PG) and AT&T (T) for the NYSE. For LSE stocks we consider the
period May 2000 to December 2002, while for NYSE stocks we consider the
time period 1995-1996. The data for the NYSE are consistent with results
reported earlier without error bars [3], while the LSE market impact data
is new. The NYSE data clearly do not follow a power law across the whole
range, consistent with earlier results. While $\beta(V_i) \approx 0.5$ for
small $V_i$, for larger $V_i$ it appears that $\beta(V_i) < 0.2$. As shown
in reference [3], this transition occurs for smaller values of $V_i$ for
stocks with lower capitalization. Thus, the assumption that $\beta = 0.5$
breaks down for high volumes, precisely where it is necessary in order for
the theory of Gabaix et al. to hold. For the London data the power law
assumption seems more justified across the whole range, but the exponent is
too low; a least squares fit gives $\beta \approx 0.26$. While we have not
attempted to compute error bars for the regression, a visual comparison
with the error bars of the individual bins makes it quite clear that
$\beta = 1/2$ is inconsistent with either the London or the NYSE data. A
separate study of eleven LSE stocks gives $\beta = 0.26 \pm 0.02$ for buy
orders and $0.23 \pm 0.02$ for sell orders; in as yet unpublished work this
has been extended to 50 stocks, with similar results. Our earlier study for
the NYSE was based on 1000 stocks [3]. It is clear that the average market
impact functions are qualitatively different for LSE and NYSE stocks, and
that for NYSE stocks the functional form varies with market capitalization
[3].

^2 The relative impact on price formation of the upstairs and downstairs
markets is not clear. On one hand, the upstairs market contains the largest
trades. On the other hand, because these trades are arranged privately and
then printed in the transaction record later, they may not have as large an
effect on price formation.

^3 Since we choose the bins to have roughly the same number of points, the
difference in Hurst exponent between bins with large and small $V$ cannot
be due to a difference in the mean interval between samples.

Even if we abandon the prediction that the average market impact is a
square root law, one might imagine that we could explain fluctuations in
prices in terms of fluctuations in volume modulated by average market
impact of the form $r_i = k V_i^{\beta}$. However, if this were true, for
the NYSE the predicted exponent for price fluctuations would be
$\alpha = \gamma/\beta \approx 1.5/0.25 = 6$, which is much too large to
agree with the data. (A typical value is $\alpha \approx 3$.) To make
matters even worse, the power law hypothesis for volume or market impact
appears to fail in some other markets. In the Paris Stock Exchange Bouchaud
et al. [4] have suggested that the average market impact function^4 is of
the form $\log V_i$, yielding $\beta \to 0$ in the limit as
$V_i \to \infty$. For the London Stock Exchange the power law hypothesis
for average market impact seems reasonable, but with an exponent
significantly smaller than 1/2. Moreover, the volume for the electronic
market is not power law distributed, as discussed in the next section.
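The statement that a logarithmic impact corresponds to $\beta \to 0$ can be
checked directly. As a sketch, define the local exponent as the logarithmic
slope of the impact function (a definition adopted here for illustration,
with $c$ an arbitrary positive constant and $V > 1$):

```latex
\begin{align}
r(V) = c \log V
\quad \Longrightarrow \quad
\beta(V) \equiv \frac{d \log r}{d \log V}
         = \frac{V}{r} \frac{dr}{dV}
         = \frac{1}{\log V}
         \;\longrightarrow\; 0
\quad \text{as } V \to \infty .
\end{align}
```

Since $\alpha = \gamma/\beta$, a vanishing $\beta$ would predict an
arbitrarily large exponent for returns, so a logarithmic impact function
cannot map a power law in volume into a power law in returns with a finite
exponent.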

Note that we are making all the above statements for 
individual orders, whereas many studies have been done 
based on aggregated data over a fixed time interval. Ag- 
gregating the data in time complicates the discussion, 
since the functional form of the market impact generally 
depends on the length of the time interval. Hence it is 
more meaningful to do the analysis based on individual 
transactions.

FIG. 3: (a) The probability density of normalized volume for three typical
high volume stocks in the LSE, LLOY (red, circles), SHEL (blue, squares),
and VOD (green, triangles) in the period May 2000 to December 2002, based
on data from the electronic exchange. There are approximately $10^6$ data
points for each stock. (b) $1 - P(r)$, where $P(r)$ is the cumulative
density function of returns induced by the same transactions in (a). For
the normalized volume there is no clear evidence for power law tails; in
contrast for returns this is quite plausible. Furthermore, the volume
distributions are essentially identical, whereas the return distribution
for VOD decays more steeply than the others.



III. VOLUME DISTRIBUTION 

The theory of Gabaix et al. explains the power law of returns in terms of
the power law of volume, so if volume doesn't have a power law, then
returns shouldn't either. The existence of a power law tail for volume
seems to vary from market to market. For the NYSE we confirm the
observation of power law tails for volume reported earlier. However, in
Figure 3 we show the distribution of volumes for three stocks in the
electronic market of the LSE. In order to compare different stocks we
normalize the data by dividing by the sample mean for each stock. All three
stocks have strikingly similar volume distributions; this is true for the
roughly twenty stocks that we have studied. There is no clear evidence for
power law scaling, even though the power law scaling of the corresponding
return distributions shown in Figure 3(b) is rather clear. If one attempts
to fit lines to the larger volume range of the curve (roughly 10^ - 10^),
the exponent of the cumulative distribution corresponding to Figure 3(a) is
highly uncertain but it is at least 3, which together with the measured
values of $\beta$ would imply $\alpha \approx 3/0.3 \approx 10$. In
contrast, the measured exponents for Figure 3(b) are roughly 2.2, 2.5, and
4.3 for SHEL, LLOY, and VOD respectively. It is noteworthy that VOD has a
much larger $\alpha$ than the other stocks, even though it has essentially
the same volume distribution and a similar market impact; if anything from
Figure 2 its $\beta$ is larger than that of the other stocks, which
according to $\alpha = \gamma/\beta$ would imply a smaller $\alpha$. This
provides yet more evidence that the power law tails of returns are not
driven by those of volume.

^4 For the NYSE the logarithmic form for average market impact is a
reasonable approximation for small $V_i$, but breaks down for higher $V_i$.

Note that one of the differences between the NYSE and 
the LSE data examined here, which may be the underly- 
ing cause of the difference in their distributions, is that 
the data from the NYSE includes upstairs market trades, 
whereas the LSE data does not. 
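The line fit to the large-volume tail discussed in this section can be
sketched as follows. This is an illustration only: the function names and
the fitting range are hypothetical, the estimator is a plain least squares
fit on log-log axes, and the sample is a synthetic Pareto draw with a known
exponent rather than LSE volume data.

```python
import numpy as np

def ccdf_tail_exponent(x, lo, hi):
    """Least-squares slope of the empirical log-CCDF versus log x on [lo, hi]."""
    x = np.sort(np.asarray(x, dtype=float))
    N = len(x)
    ccdf = 1.0 - np.arange(N) / N           # fraction of samples >= x[i]
    mask = (x >= lo) & (x <= hi)
    slope, _ = np.polyfit(np.log(x[mask]), np.log(ccdf[mask]), 1)
    return -slope                            # P(X > x) ~ x**(-exponent)

def normalize_by_mean(vol):
    """Divide volumes by their sample mean, as done before comparing stocks."""
    vol = np.asarray(vol, dtype=float)
    return vol / vol.mean()

# Synthetic Pareto sample with known tail exponent 3 (not real volume data).
rng = np.random.default_rng(2)
sample = 1.0 + rng.pareto(3.0, size=200_000)
gamma_hat = ccdf_tail_exponent(sample, lo=1.0, hi=10.0)
alpha_implied = gamma_hat / 0.3              # alpha = gamma / beta with beta = 0.3
```

With a volume exponent near 3 and $\beta \approx 0.3$ the implied return
exponent is close to 10, far from the measured values of roughly 2.2 to 4.3
quoted above.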



IV. CONCLUSION 

We have shown that the conclusions of Gabaix et al. are suspect for three
different reasons. First, their statistical analysis in claiming the
existence of a square root law for average market impact lacks power to
reject alternative hypotheses in the presence of the strong
autocorrelations that are present in real data. Second, new measurements of
the average market impact with proper error bars show that it does not
follow a square root law. Third, for electronic data from the London Stock
Exchange the distribution of volumes does not have a power law tail, and
there are substantial variations between the return distributions that are
not reflected in variations in volume or average market impact. Thus, it
seems that the distribution of large price fluctuations cannot be explained
as a simple transformation of volume fluctuations.

This leaves open the question of what really causes the power law tails of
prices. We believe that the correct explanation lies in the extension of
theories based on the stochastic properties of order placement and price
formation [12, 13, 14], which naturally give rise to fluctuations in the
response of prices to orders. Further work is clearly needed.

Note added in press: In a recent study it has been shown that large price
fluctuations in the NYSE and the electronic portion of the LSE are driven
by fluctuations in liquidity. That is, if one matches up returns with the
orders that generate them, the conditional distribution of large returns is
essentially independent of order size. This has been confirmed for the NYSE
and Island by Weber and Rosenow [16]. The idea that the tail of prices is
driven by fluctuations in liquidity rather than fluctuations in the number
of trades was implicitly suggested earlier by results of Plerou et al.